and Elena Khapalova 

Washington State University 

We derive exact computable expressions for the asymptotic dis- 
tribution of the change-point mle when a change in the mean oc- 
curred at an unknown point of a sequence of time-ordered indepen- 
dent Gaussian random variables. The derivation, which assumes that 
nuisance parameters such as the amount of change and variance are 
known, is based on ladder heights of Gaussian random walks hitting 
the half-line. We then show that the exact distribution easily extends 
to the distribution of the change-point mle when a change occurs 
in the mean vector of a multivariate Gaussian process. We perform 
simulations to examine the accuracy of the derived distribution when 
nuisance parameters have to be estimated as well as robustness of the 
derived distribution to deviations from Gaussianity. Through simula- 
tions, we also compare it with the well-known conditional distribution 
of the mle, which may be interpreted as a Bayesian solution to the 
change-point problem. Finally, we apply the derived methodology to 
monthly averages of water discharges of the Nacetinsky creek, Ger- 
many. 

1. Introduction. While modeling time-ordered data, one is concerned 
about the parameters of the model being dynamically stable. One way of 
addressing the dynamic instability of the model parameters is to model the 
time dependence of parameters through a possible change at an unknown 
time-point so that the parameters remain stable both before and after the 
unknown change-point. Clearly, the methodology is extremely important 
from a practical point of view, mainly because the changes in phenomena 
observed over time usually occur unannounced, such as changes in the quality
characteristic of a manufacturing process, changes in water or air quality
over time, changes in the pattern of stock market indices and so on.
The change-point problem allows modelers to detect the presence of any
such unknown change-points and further capture them through either point 
or interval estimates. Such modeling has found applications in all areas 
of scientific endeavor, including environmental monitoring, global climatic 
changes, quality control, reliability, financial and econometric time series, 
and medicine, to name a few. For examples of real life applications, see 
Braun and Muller (1998) for application of change-point methods in DNA 
segmentation and bioinformatics; Fearnhead (2006), Ruggieri et al. (2009) 
for applications in geology; Perreault et al. (2000a, 2000b) for application 
in hydrology; Jaruskova (1996) for applications in meteorology; Fealy and 
Sweeney (2005) and DeGaetano (2006) for applications in climatology; Ka- 
plan and Shishkin (2000) and Lebarbier (2005) for applications in signal 
processing; Andrews and Ploberger (1994), and Hansen (2000) for applica- 
tions in econometrics; and Lai (1995), Wu, Cheng and Jeng (2005) and Zou, 
Qiu and Hawkins (2009) for applications in statistical process control. Even 
though there are recent advances in addressing multiple changes in scien- 
tific phenomena [see Fearnhead (2006), Fearnhead and Liu (2007), Giron, 
Moreno and Casella (2007) and Seidou and Ouarda (2007)], the classical 
change-point literature is most well developed in the case of a single un- 
known change-point in time-ordered processes. 

Classical change-point methods involve two fundamental inferential prob- 
lems, detection and estimation. Under the likelihood-based approach, the de- 
tection part is addressed through likelihood ratio statistics and their asymp- 
totic sampling distributions. Maximum likelihood estimation of an unknown 
change-point first begins with obtaining the mle as a point estimate. Interval 
estimates of any desired level, which are preferred over point estimates, can 
be constructed around the mle, provided distribution theory for the mle is 
available. However, distribution theory for a change-point mle can be analyt- 
ically intractable, particularly when no smoothness conditions are assumed 
regarding the amount of change. In contrast, advances in the Bayesian ap- 
proach to change-point methodology have been occurring at a faster pace. 
Ever since Markov chain Monte Carlo (MCMC) methods were seen as a 
tool for overcoming the computational complexities in Bayesian analysis, 
there has been rapid progress in the overall development of this important 
methodological tool, and advances in Bayesian change-point analysis have 
not lagged behind. 

While the classical change-point problem dates back to Page (1955), there 
has been a large amount of literature on the problem covering both detec- 
tion and estimation aspects. One may consult the monographs of Brodsky 
and Darkhovsky (1993, 2000), Basseville and Nikiforov (1993), Csorgo and 
Horvath (1997), Chen and Gupta (2000) and Wu (2005), as well as a rich 
collection of references in these monographs for a comprehensive account of 
various approaches to inference on change-point problems. In reviewing the 
literature in terms of both theory and applications, it becomes clear that 
the detection aspect of the change-point problem attracted greater atten- 
tion than its counterpart of estimation. Perhaps this has not been acciden- 
tal, in that asymptotic theory for change-point estimators is technically a 
more challenging problem than deriving asymptotic distribution theory for 
change detection statistics. In an attempt to make estimation of the un- 
known change-point more accessible to practitioners, the main purpose of 
this paper is to derive exact computable expressions for the asymptotic dis- 
tribution of the maximum likelihood estimate (mle) of the unknown change- 
point when a change occurs abruptly in the mean only of a Gaussian process. 

Asymptotic distribution theory for the change-point mle in the abrupt 
case was first initiated by Hinkley (1970, 1971, 1972). While Hinkley (1970) 
derived the asymptotic theory for the change-point mle in a fairly general 
setup, the distribution was not in a computable form, and was primarily 
technical in nature. It turns out that Hinkley (1970) computed the distribu- 
tion for change in the mean of a normal distribution only through certain 
approximations. While Hu and Rukhin (1995) provided a lower bound for 
the probability of the mle being in error of capturing the true change-point, 
Jandhyala and Fotopoulos (1999) and Fotopoulos and Jandhyala (2001) de- 
rived upper and lower bounds and also suggested two approximations for 
the asymptotic distribution of the change-point mle. Similarly, Borovkov 
(1999) also provided only upper and lower bounds for the distribution of 
the change-point mle. Thus, despite the attempts of various authors, the 
problem of deriving computable expressions for the asymptotic distribution 
of the change-point mle has remained unsolved to date. It is particularly
striking that exact computable expressions for the asymptotic distribution of the 
change-point mle have not been derived in the literature for even selected 
distributions of the underlying process such as the Gaussian and exponential 
distributions. 

Tackling this important problem, we derive in this article an exact
computable expression for the distribution of the change-point mle when a 
change occurs in the mean only of a univariate or multivariate Gaussian 
process. The derived asymptotic distribution is not only exact but is also 
quite elegant and can be computed in a simple and straightforward manner. 
In fact, the result we derive demonstrates that the second suggested ap- 
proximation in Jandhyala and Fotopoulos (1999) is the exact solution to the 
problem, in the Gaussian case. It should be pointed out that the distribution 
we derive assumes that the parameters of the distribution before and after 
the change-point are known. However, this should not pose difficulties, since 
Hinkley [(1972), page 520], in a theorem has shown that the asymptotic 
distribution of the change-point mle remains the same even for unknown 
parameter scenarios. From a practical point of view, this asymptotic
equivalence result is extremely important. In practice, apart from the
change-point being unknown, the parameters before and after the change-point
also invariably remain unknown. The problem of deriving the distribution of
the change-point mle when the parameters are unknown is the one in which
practitioners would be most interested, as opposed to the distribution of the 
change-point mle for the case when the parameters are known. There is no 
a priori reason to believe that the distributions of the change-point mle for 
the known and unknown cases should be asymptotically equivalent. It is in this 
sense that the asymptotic equivalence result of Hinkley (1972) plays a key 
role for practitioners. One only needs to examine whether this asymptotic 
property holds well for reasonable sample sizes, and for this we carry out a 
simulation study in Section 4. 

Since the exact solution derived in the paper assumes Gaussianity, it is 
tempting to explore robustness of this exact computable expression when 
the true process deviates from Gaussianity. If the derived result is indeed 
robust to such departures, then it can be applied more widely than merely 
Gaussian processes. While a simulation study covering a wide class of non- 
Gaussian families of distributions may be of interest for practitioners, in 
this paper we pursue a limited robustness study by performing large scale 
simulations wherein the error process is assumed to be symmetric and follows 
the t-distribution, or asymmetric and follows the standardized chi-square 
distribution. In both cases, we change the degrees of freedom from being 
small to large, so that one approaches Gaussianity as the degrees of freedom 
become large. 

Hinkley's approach to deriving distribution of the change-point mle is 
perceived as the unconditional approach in the literature. Against this, Cobb 
(1978) proposed a conditional approach to the distribution of the change- 
point mle, wherein the distribution of the mle is derived by conditioning 
upon sufficient information on either side of the unknown change-point. 
Since the exact distribution of the unconditional mle is now available, it is 
relevant to compare the conditional and unconditional distributions in terms 
of their performance, including robustness properties. Thus, we have also 
included Cobb's conditional distribution in our simulations. As pointed out 
by Cobb (1978), since the conditional distribution of the change-point mle 
can also be interpreted as the Bayesian posterior for the change-point under 
a uniform prior on the unknown change-point, the comparisons between the 
two distributions have a broader appeal than what might appear at first 
glance. 

Finally, we apply the methodology derived in the paper to multivariate 
analysis of hydrological data. The data, previously analyzed in a univari- 
ate setup by Gombay and Horvath (1997), represents averages of log trans- 
formed water discharges for the Nacetinsky creek for the months of February, 
July and August during the years 1951-1990. The bivariate and trivariate 
change-point analysis shows that a significant increase has occurred in the 
water discharges, whereas the univariate change-point analyses show no
significant changes in the mean water flows. 

The organization of the paper is as follows. In Section 2 we present some 
general background regarding the change-point mle and its asymptotic dis- 
tribution. Then, we state the main theorem in Section 3, and the proof of 
the theorem is presented in Appendix A. While Section 4 consists of empir- 
ical assessment of the performance of derived theory for the case of known 
and unknown parameters, Section 5 contains the multivariate change-point 
analysis of the Nacetinsky creek data. Finally, Section 6 concludes the paper 
with a discussion. 



2. Distribution of the mle. Let $Y_1, Y_2, \ldots, Y_n$, $n \ge 1$, be a sequence of
real-valued independent time ordered random variables defined on a
probability space $(\Omega, \mathcal{F}, P)$. Let there be a natural number
$\tau_n \in \{1, 2, \ldots, n-1\}$ such that $Y_1, Y_2, \ldots, Y_{\tau_n}$ have a common
distribution $F_1$, whereas the subsequent observations
$Y_{\tau_n+1}, Y_{\tau_n+2}, \ldots, Y_n$ have a common distribution $F_2$
with $F_1 \ne F_2$. Here, the change-point $\tau_n$ is an unknown parameter and
should be estimated. The likelihood function of $\tau_n$ is given by
$p_n(Y;\tau_n) = \prod_{i=1}^{\tau_n} f_1(Y_i) \prod_{i=\tau_n+1}^{n} f_2(Y_i)$,
where the functions $f_1$ and $f_2$ are densities of $F_1$ and $F_2$,
respectively, with respect to some dominating measure $\mu$ ($F_1, F_2 \ll \mu$).
In the sequel we assume that the densities $f_1$ and $f_2$ are known, perhaps
through known parameters. Following Hinkley (1970), the mle $\hat\tau_n$ may be
expressed as

(2.1)    $\hat\tau_n = \arg\max_{1 \le j \le n-1} \sum_{i=1}^{j} a(Y_i)$,

where $a(Y_i) = \log\{f_1(Y_i)/f_2(Y_i)\}$, $i = 1, \ldots, n-1$. For establishing
distribution theory, it is convenient to work with
$\hat\tau_n - \tau_n \in \{-\tau_n+1, \ldots, n-\tau_n-1\}$ instead of $\hat\tau_n$.
Hence, we have

(2.2)    $\xi_n = \hat\tau_n - \tau_n = \arg\max_{-\tau_n+1 \le j \le n-\tau_n-1} \sum_{i=1}^{\tau_n+j} a(Y_i)$,

where the maximizer is a result of the following two-sided random walk $\Gamma(\cdot)$:

(2.3)    $\Gamma_n(j,\tau_n) = \begin{cases}
             \sum_{i=1}^{j} a(Y_i^*) = \sum_{i=1}^{j} X_i^* = S_j^*, & j \in \{1, \ldots, n-\tau_n-1\}, \\
             0, & j = 0, \\
             -\sum_{i=1}^{-j} a(Y_i) = \sum_{i=1}^{-j} X_i = S_{-j}, & j \in \{-\tau_n+1, \ldots, -1\}.
         \end{cases}$

Here, $\{Y, Y_i : i \ge 1\}$ and $\{Y^*, Y_i^* : i \ge 1\}$ are two independent sequences with
independent and identical copies on $(\mathbb{R}, \mathcal{R})$ such that $Y$ is distributed
according to $F_1$, and $Y^*$ is distributed according to $F_2$. Note that $X$ and $X^*$
are real valued random variables defined on $\mathbb{R}$. Also note that when $F_1 \ne F_2$,

(2.4)    $E(X) = -\int_S \log\{f_1(x)/f_2(x)\} f_1(x)\,\mu(dx) = -K(f_1,f_2) = -E_{f_1}\{a(Y)\} < 0$  and
         $E(X^*) = \int_S \log\{f_1(x)/f_2(x)\} f_2(x)\,\mu(dx) = -K(f_2,f_1) = E_{f_2}\{a(Y^*)\} < 0$,

where $K$ is the usual Kullback-Leibler information. It can be seen that (2.4)
is also related to the entropy function, which in many instances is used for
measuring the distinctness of probabilities. We assume that $P(X > 0) > 0$.
For $\theta > 0$, let

(2.5)    $\phi(\theta) = E\{\exp(\theta X)\}$  and  $\phi^*(\theta) = E\{\exp(\theta X^*)\}$.

Note that $\phi(\theta) = \phi^*(1-\theta)$. Moreover, $\phi(\theta) \le 1$, $\forall \theta \in [0,1]$, since

(2.6)    $\phi(\lambda) = \int_S f_1(x)\{f_1(x)/f_2(x)\}^{-\lambda}\,\mu(dx) = \int_S f_1^{1-\lambda}(x) f_2^{\lambda}(x)\,\mu(dx)$
         $\le \{\int_S f_1(x)\,\mu(dx)\}^{1-\lambda} \{\int_S f_2(x)\,\mu(dx)\}^{\lambda} = 1$.



It is known that when $E(X) < 0$, $P(X > 0) > 0$ and
$\vartheta = \sup\{\theta \ge 0 : \phi(\theta) \le 1\}$, the asymptotic behavior of the tail
for the ultimate maximum, $M = \sup\{S_n : n \in \mathbb{N}\}$, can be described by the
following three cases:

(i) $\vartheta = 0$, the tail has a polynomial form (sub-exponential case),

(ii) $\vartheta > 0$ and $\phi(\vartheta) < 1$, an intermediate case,

(iii) $\vartheta > 0$ and $\phi(\vartheta) = 1$, the Cramér case.

Now, in a sequence of observations for which $F_1 \ne F_2$, the $\mu$-derivatives
also satisfy $f_1 \ne f_2$. From (2.6), it is clear that the choice of $\vartheta$ greater
than zero for which (iii) is satisfied is $\vartheta = 1$, the unity. Consequently, it
follows that $X$ satisfies Cramér's condition. Furthermore, merely noting that
$\phi^*(\vartheta) = \phi(1 - \vartheta)$, it follows that $X^*$ also satisfies Cramér's condition.
This observation implies that $\vartheta = \vartheta^* = 1$ in Proposition 1 of Jandhyala and
Fotopoulos (1999) for general distributions including Gaussian random
variables.

It also follows that $\phi(\theta) < 1$, $\forall \theta \in (0,1)$, and that $\phi$ is strictly convex on
$\theta \in (0,1)$. This suggests that $\phi(\theta)$ attains its minimum at a unique $\theta_0 \in (0,1)$
such that $\phi(\theta_0) = \inf_{\theta \in (0,1)} \phi(\theta) < 1$. This firmly establishes that
assumptions 1-3 in Jandhyala and Fotopoulos (1999) are no longer required and
that they hold naturally whenever $F_1 \ne F_2$ and $P(X > 0) > 0$ are satisfied.
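
As a numerical illustration of (2.6) and of the location of $\theta_0$, the following sketch evaluates $\phi(\theta) = \int f_1^{1-\theta} f_2^{\theta}\,d\mu$ by a trapezoid rule. It assumes, purely for illustration, that $f_1$ and $f_2$ are $N(0,1)$ and $N(1,1)$ densities; the function names, the integration range and the grid are our own choices and are not part of the derivation above.

```python
import math

def normal_pdf(x, mean, sd):
    """Density of N(mean, sd^2) at x."""
    z = (x - mean) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def phi(theta, mu1=0.0, mu2=1.0, sd=1.0, lo=-12.0, hi=13.0, steps=4000):
    """Trapezoid approximation of phi(theta) = integral of f1^(1-theta) * f2^theta."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        x = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * normal_pdf(x, mu1, sd) ** (1.0 - theta) * normal_pdf(x, mu2, sd) ** theta
    return total * h

# Evaluate phi on a grid over [0, 1] and locate the grid minimiser theta_0.
grid = [i / 20 for i in range(21)]
values = [phi(t) for t in grid]
theta0 = grid[values.index(min(values))]
print(theta0, min(values))
```

Under these assumed densities the grid values equal one at both end points and dip below one in between, in line with the convexity argument.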

In this paper we are interested in deriving the distribution of the limiting
variable $\xi_\infty$, by letting $n \to \infty$ in such a way that $\tau_n \to \infty$ and
$n - \tau_n \to \infty$. In this regard, it has been shown that $\xi_\infty$ is a proper random
variable and $\xi_n \to \xi_\infty$ a.s. [see, e.g., Fotopoulos and Jandhyala (2001)].
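
The estimator in (2.1) and its centered version in (2.2) can be sketched in a few lines. The example assumes, for illustration only, Gaussian densities with known means and a common known standard deviation; the helper names `log_ratio` and `change_point_mle` and the simulated data are hypothetical.

```python
import math
import random

def log_ratio(y, mu1, mu2, sigma):
    """a(y) = log{f1(y)/f2(y)} for N(mu1, sigma^2) versus N(mu2, sigma^2)."""
    return ((y - mu2) ** 2 - (y - mu1) ** 2) / (2.0 * sigma ** 2)

def change_point_mle(ys, mu1, mu2, sigma):
    """Return the argmax over j in {1, ..., n-1} of sum_{i<=j} a(Y_i), as in (2.1)."""
    best_j, best_val, running = None, -math.inf, 0.0
    for j, y in enumerate(ys[:-1], start=1):
        running += log_ratio(y, mu1, mu2, sigma)
        if running > best_val:
            best_j, best_val = j, running
    return best_j

# Simulated example: n = 100 observations with a mean shift after tau = 50.
rng = random.Random(0)
n, tau, mu1, mu2, sigma = 100, 50, 0.0, 3.0, 1.0
ys = [rng.gauss(mu1 if i < tau else mu2, sigma) for i in range(n)]
tau_hat = change_point_mle(ys, mu1, mu2, sigma)
xi_n = tau_hat - tau  # centered estimator of (2.2)
print(tau_hat, xi_n)
```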

We begin by stating a theorem found in Fotopoulos (2009). For all pur- 
poses, this result is a restatement of Theorem 2 in Jandhyala and Fotopoulos 
(1999). 

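Theorem 2.1. Let $F_1 \ne F_2$ and $P(X > 0) > 0$. Then, the probability
distribution of $\xi_\infty$ is given by

$P(\xi_\infty = j) = \begin{cases}
    P(T^+ = \infty)\{P(T^- > -j) - \int_0^\infty P(M^* > x)\,P(T^- > -j \cap S_{-j} \in dx)\}, & j = -1, -2, \ldots, \\
    P(T^+ = \infty)\,P(T^{*+} = \infty), & j = 0, \\
    P(T^{*+} = \infty)\{P(T^{*-} > j) - \int_0^\infty P(M > x)\,P(T^{*-} > j \cap S_j^* \in dx)\}, & j = 1, 2, \ldots,
\end{cases}$

where $T^+ := \inf\{j > 0 : S_j > 0\}$, $T^- := \inf\{j > 0 : S_j \le 0\}$ and $M := \max_{n \ge 0} S_n$,
and $M^*$, $T^{*+}$ and $T^{*-}$ are defined in a similar manner.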

The convergence rate of the above asymptotic result is of interest for 
purposes of both theory and practice. Knowledge about the convergence 
rate allows one to judge the appropriateness of the sample size and other 
ancillary parameters for which the asymptotic distribution can be utilized 
for finite sample sizes without committing disproportional errors. In this 
regard, both Borovkov (1999) and Jandhyala and Fotopoulos (2001) derived 
important results that establish the convergence rate applicable to Theorem 
2.1. We state here some relevant facts from these articles and then formulate 
a theorem without proof that establishes a bound for the total variation 
distance between the finite sample and infinite sample distributions of the 
change-point mle. 

From Theorem 2 of Jandhyala and Fotopoulos (2001), we have

$\sup_{B \in \mathcal{B}_{\tau_n,n}} |P(\xi_n \in B) - P(\xi_\infty \in B)| = P(\xi_\infty \le -\tau_n \text{ or } \xi_\infty \ge n - \tau_n)$,

where $\mathcal{B}_{\tau_n,n}$ is the Borel $\sigma$-field defined on
$Z_{\tau_n,n} = \{-\tau_n+1, \ldots, 0, \ldots, n-\tau_n-1\}$. Then, as argued in Jandhyala and
Fotopoulos (2001), upon augmenting $\mathcal{B}_{\tau_n,n}$ into the Borel $\sigma$-field
$\mathcal{B}$ on $\mathbb{Z}$, it follows that the total variation distance between $\xi_n$
and $\xi_\infty$ defined by

$d_{TV}(\xi_n, \xi_\infty) = \sup_{B \in \mathcal{B}} |P(\xi_n \in B) - P(\xi_\infty \in B)|$

may be seen to yield

(2.7)    $d_{TV}(\xi_n, \xi_\infty) = P(\xi_\infty \le -\tau_n \text{ or } \xi_\infty \ge n - \tau_n)$.

The following theorem, which provides a bound for $d_{TV}(\xi_n, \xi_\infty)$, follows
immediately upon applying (2.7) into Theorem 1 of Borovkov (1999).

Theorem 2.2. Let $F_1 \ne F_2$ and $P(X > 0) > 0$. Let $\xi_n$ and $\xi_\infty$ be the
centered random variables of the change-point mle for finite and infinite
samples, respectively. Then, the total variation distance between $\xi_n$ and $\xi_\infty$
admits the inequality given by

$d_{TV}(\xi_n, \xi_\infty) \le 4 \max\{\phi(\theta_0)^{\tau_n}, \phi(\theta_0)^{n-\tau_n}\}$,

where $\phi(\theta_0) = \inf_{\theta \in (0,1)} \phi(\theta) < 1$.

Theorem 2.2 clearly establishes a geometric rate of convergence as $\xi_n$
approaches $\xi_\infty$, asymptotically. The above result is more friendly from a
computational point of view than Theorem 3 of Jandhyala and Fotopoulos
(2001).

While Theorem 2.1 provides the probability distribution of $\xi_\infty$, the
expressions therein are still only of technical interest. The main problem is
that, as far as we know, a computable expression for the distribution
function $M(x)$ [or $M^*(x)$] is not available in the literature. Clearly, the behavior
of $1 - M(x)$ [or $1 - M^*(x)$] depends upon the characteristics of the
underlying densities $f_1$ and $f_2$ under study. Moreover, the term $P(T^+ = \infty)$ that
appears in both Theorems 2.1 and 2.2 may also be unavailable for
computation unless we know the exact distribution of $S_n$, for all $n \in \mathbb{N}$. Thus,
the determination of an exact expression for the distribution of $M$ for any
general distribution is beyond analytical scope, and consequently, an exact
computable form for the probability distribution $P(\xi_\infty = j)$, $j \in \mathbb{Z}$, in
Theorem 2.1 is also analytically not tractable. To this end, in this paper we
shall concentrate on developing the analysis by assuming that the underlying
process is of Gaussian type.

3. Asymptotic distribution of the mle under Gaussian processes. We
shall establish the main theorem regarding the computationally accessible
distribution of $\xi_\infty$ first under the univariate Gaussian case. Subsequently, we
shall illustrate how the univariate case itself can be directly applied to the
more general multivariate setup.

3.1. The univariate Gaussian case. We begin by assuming that the
underlying process is univariate Gaussian, and the means before and after the
change-point are given by $\mu_1, \mu_2$, wherein we let $\mu_1 \ne \mu_2$. We do assume
that the standard deviation $\sigma$ is known and remains the same throughout
the sampling period. Clearly, the likelihood ratios in (2.1) may then be
expressed as

(3.1)    $X = -a(Y) = \log\{f_2(Y)/f_1(Y)\}$
         $= \log\left\{ \frac{1}{\sqrt{2\pi}\sigma} e^{-(Y-\mu_2)^2/(2\sigma^2)} \Big/ \frac{1}{\sqrt{2\pi}\sigma} e^{-(Y-\mu_1)^2/(2\sigma^2)} \right\}$
         $=_D -\frac{(\mu_1-\mu_2)^2}{2\sigma^2} + \frac{(\mu_2-\mu_1)}{\sigma} Z$,

where $Z \sim N(0,1)$, and, similarly,

(3.2)    $X^* =_D -\frac{(\mu_1-\mu_2)^2}{2\sigma^2} + \frac{(\mu_1-\mu_2)}{\sigma} Z^*$,

where $Z^* \sim N(0,1)$, and is independent of $Z$. Note that in this case, the
random variables $X$ and $X^*$ are both identically distributed with means
$E(X) = E(X^*) = -\eta^2/2 < 0$ and variances $\mathrm{var}(X) = \mathrm{var}(X^*) = \eta^2$, where
$\eta = |\mu_1 - \mu_2|/\sigma$ represents the standardized amount of change. Hence, it is
sufficient to confine our analysis to only one side of the random walk $\Gamma(\cdot)$.
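
A quick simulation can be used to check the moments implied by (3.1). The sketch below draws $Y$ from the pre-change distribution, forms $X = \log\{f_2(Y)/f_1(Y)\}$ and compares the sample mean and variance with $-\eta^2/2$ and $\eta^2$. The parameter values, the seed and the helper name `x_from_y` are illustrative assumptions only.

```python
import math
import random

def x_from_y(y, mu1, mu2, sigma):
    """X = log{f2(y)/f1(y)} for the two Gaussian densities in (3.1)."""
    return ((y - mu1) ** 2 - (y - mu2) ** 2) / (2.0 * sigma ** 2)

rng = random.Random(1)
mu1, mu2, sigma = 0.0, 1.5, 1.0
eta = abs(mu1 - mu2) / sigma  # standardized amount of change

# Draw Y from the pre-change distribution N(mu1, sigma^2) and transform.
xs = [x_from_y(rng.gauss(mu1, sigma), mu1, mu2, sigma) for _ in range(200000)]
mean_x = sum(xs) / len(xs)
var_x = sum((x - mean_x) ** 2 for x in xs) / (len(xs) - 1)

print(mean_x, -eta ** 2 / 2.0)  # sample mean against -eta^2/2
print(var_x, eta ** 2)          # sample variance against eta^2
```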

Under the formulation in (3.1), it can be seen that $S_n =_D -n\eta^2/2 - \eta\sqrt{n}\,Z$,
where again $Z \sim N(0,1)$. Note [Asmussen (1987), Corollary 4.4] that when
$E(X) < 0$, the ladder height distribution given by
$G_+(dx) = P(S_{T^+} \in dx \cap T^+ < \infty)$ is defective. Thus, $\|G_+\| = P(T^+ < \infty) < 1$ and
$1/E(T^-) = 1 - \|G_+\| = P(T^+ = \infty) = P(M = 0)$. We shall now state our main theorem,
which provides a computable expression for the distribution of $\xi_\infty$. The
computability of the terms in the expression will be demonstrated in the
discussion following the theorem. The proof of the theorem is presented in
Appendix A. Subsequent to the theorem, we state a corollary, which
establishes a closed form computable expression for the bound in Theorem
2.2.

Theorem 3.1. Suppose that the time-ordered sequence $Y_1, Y_2, \ldots, Y_n$,
$n \ge 1$, is such that $Y_i \sim N(\mu_1, \sigma^2)$, $i = 1, \ldots, \tau_n$, and $Y_i \sim N(\mu_2, \sigma^2)$,
$i = \tau_n + 1, \ldots, n$. Then, the probability distribution of $\xi_\infty$ is given by

$P(\xi_\infty = k) = \begin{cases}
    (1 - \|G_+\|)(q_{|k|} - \|G_+\|\,\tilde q_{|k|}), & k = \pm 1, \pm 2, \ldots, \\
    (1 - \|G_+\|)^2, & k = 0,
\end{cases}$

where $1 - \|G_+\| = \exp\{-\sum_{j=1}^{\infty} j^{-1}\,\bar\Phi(\eta\sqrt{j}/2)\}$, with $\bar\Phi = 1 - \Phi$ the
standard normal tail probability, and $q_k = E\{I(T^- > k)\}$,
$\tilde q_k = E\{e^{-S_k} I(T^- > k)\}$, $k = 1, 2, \ldots$ and $q_0 = \tilde q_0 = 1$.

It is fairly straightforward to state the bound in Theorem 2.2 for the
Gaussian case. Specifically, it follows that the total variation distance in the
Gaussian case admits

(3.3)    $d_{TV}(\xi_n, \xi_\infty) \le 4 \max\{\exp(-\eta^2 \tau_n/8),\ \exp(-\eta^2 (n - \tau_n)/8)\}$.
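
To make the quantities in Theorem 3.1 and the bound (3.3) concrete, the following sketch evaluates $1 - \|G_+\|$ from a truncated series and estimates $q_k$ and $\tilde q_k$ by plain Monte Carlo. The Monte Carlo step is our own stand-in for illustration and is not the computational procedure of the paper; the truncation point, the number of replications, the seed and all function names are assumptions.

```python
import math
import random

def norm_sf(x):
    """Standard normal tail probability 1 - Phi(x)."""
    return 0.5 * math.erfc(x / math.sqrt(2.0))

def one_minus_G(eta, terms=2000):
    """1 - ||G+|| = exp{-sum_{j>=1} sf(eta*sqrt(j)/2)/j}, series truncated at `terms`."""
    s = sum(norm_sf(eta * math.sqrt(j) / 2.0) / j for j in range(1, terms + 1))
    return math.exp(-s)

def q_terms(eta, kmax, reps=20000, seed=0):
    """Monte Carlo estimates of q_k and q~_k for k = 0, ..., kmax."""
    rng = random.Random(seed)
    q = [0.0] * (kmax + 1)
    qt = [0.0] * (kmax + 1)
    for _ in range(reps):
        s = 0.0
        q[0] += 1.0
        qt[0] += 1.0
        for k in range(1, kmax + 1):
            s += rng.gauss(-eta * eta / 2.0, eta)  # step with mean -eta^2/2, sd eta
            if s <= 0.0:  # T^- has occurred, so the path leaves {T^- > k}
                break
            q[k] += 1.0
            qt[k] += math.exp(-s)
    return [v / reps for v in q], [v / reps for v in qt]

def pmf(eta, kmax, reps=20000, seed=0):
    """Expression of Theorem 3.1 on k = -kmax, ..., kmax with estimated q terms."""
    a = one_minus_G(eta)
    g = 1.0 - a
    q, qt = q_terms(eta, kmax, reps, seed)
    return {k: a * (q[abs(k)] - g * qt[abs(k)]) for k in range(-kmax, kmax + 1)}

def tv_bound(eta, n, tau):
    """Right-hand side of (3.3)."""
    return 4.0 * max(math.exp(-eta ** 2 * tau / 8.0),
                     math.exp(-eta ** 2 * (n - tau) / 8.0))

probs = pmf(1.0, 5)
print(probs[0], tv_bound(1.0, 100, 50))
```

The value at $k = 0$ involves no simulation, since $q_0 = \tilde q_0 = 1$; only the terms with $k \ne 0$ carry Monte Carlo error in this sketch.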

3.2. The multivariate Gaussian case. Here, we let $\{Y_i : i \in \mathbb{N}\}$ be a
sequence of time-ordered independent Gaussian elements defined on $\mathbb{R}^d$, the
$d$-dimensional Euclidean space with $f(x; \mu_{d \times 1}, \Sigma_{d \times d})$ denoting the
corresponding probability density function. In the sequel, mainly for convenience, we
represent the parameter only as $(\mu, \Sigma)$ by dropping the respective dimension
subscripts. Let the parameter $(\mu, \Sigma)$ change from its initial value of $(\mu_1, \Sigma)$
to $(\mu_2, \Sigma)$, at some unknown index point $\tau_n \in \{1, 2, \ldots, n-1\}$, with mean
vectors $\mu_1, \mu_2 \in \Theta$, and common variance-covariance matrix $\Sigma$. For reason
of convenience, we assume that $\Sigma$ is positive definite and the mean vectors
satisfy $\mu_1 \ne \mu_2$.

The functional $\langle x, y \rangle$ denotes the usual inner product and the extended
semi-norm is defined if there exists a covariance operator $\Sigma$ such that
$\|x\|_\Sigma^2 = \langle \Sigma x, x \rangle$. Then, we may write $Y =_D \mu_1 + \Sigma^{1/2} Z$ for all data before
the change-point, where $Z$ is a $d$-variate standard normal vector. Consequently, the
random variable $X = -\ln f(Y; \mu_1, \Sigma)/f(Y; \mu_2, \Sigma)$ is expressed as

(3.4)    $X = \frac{1}{2}\{\langle \Sigma^{-1}(Y - \mu_1), Y - \mu_1 \rangle - \langle \Sigma^{-1}(Y - \mu_2), Y - \mu_2 \rangle\}$
         $=_D -\frac{1}{2}\|\mu_1 - \mu_2\|_{\Sigma^{-1}}^2 + \|\mu_1 - \mu_2\|_{\Sigma^{-1}} Z$,

where $Z$ now stands for the standard normal random variable with mean
zero and variance one.

Similarly, for data after the change-point, we have $Y^* =_D \mu_2 + \Sigma^{1/2} Z^*$,
where $Z^*$ is the $d$-variate standard normal vector, and in this case, we obtain

(3.5)    $X^* = \ln f(Y^*; \mu_1, \Sigma)/f(Y^*; \mu_2, \Sigma)$
         $=_D -\frac{1}{2}\|\mu_1 - \mu_2\|_{\Sigma^{-1}}^2 + \|\mu_1 - \mu_2\|_{\Sigma^{-1}} Z^*$,

where $Z^*$ is univariate standard normal independent of $Z$. Upon letting
$\eta = \|\mu_1 - \mu_2\|_{\Sigma^{-1}}$ represent the amount of standardized change in the means,
it should be clear that the multivariate case translates itself into a
corresponding univariate case with $\eta$ as defined above.
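
The reduction to a single standardized change $\eta$ can be illustrated for the bivariate case. The sketch below computes $\eta = \|\mu_1 - \mu_2\|_{\Sigma^{-1}}$ using the closed-form inverse of a $2 \times 2$ covariance matrix; the function name and the numerical inputs are hypothetical and chosen only for illustration.

```python
import math

def standardized_change(mu1, mu2, cov):
    """eta = sqrt((mu1 - mu2)' Sigma^{-1} (mu1 - mu2)) for a 2x2 covariance matrix."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance matrix must be positive definite")
    d1, d2 = mu1[0] - mu2[0], mu1[1] - mu2[1]
    # Sigma^{-1} = (1/det) * [[d, -b], [-c, a]], so the quadratic form expands to:
    quad = (d * d1 * d1 - (b + c) * d1 * d2 + a * d2 * d2) / det
    return math.sqrt(quad)

# Unit variances with correlation 0.5 and a unit shift in each coordinate.
eta = standardized_change((0.0, 0.0), (1.0, 1.0), ((1.0, 0.5), (0.5, 1.0)))
print(eta)
```

Once $\eta$ is obtained in this way, the univariate expressions apply unchanged.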

4. Performance of the distribution of the change-point mle. In this sec- 
tion we wish to assess the performance of the derived asymptotic distribution 
in two different ways. First, we investigate the equivalence result of Hinkley 
(1972) and, second, we compare the derived distribution of the mle with the 
conditional distribution of mle as derived by Cobb (1978). 



Table 1

Total variation distances of known and estimated empirical distributions (based on
500,000 simulations) from theoretical distribution of change-point mle in the univariate
case

                $\eta = 1$         $\eta = 1.5$        $\eta = 2$         $\eta = 2.5$
  n   $\tau$   Known    Est.      Known    Est.      Known    Est.      Known    Est.
100     20     0.0106   0.0665    0.0070   0.0264    0.0033   0.0139    0.0014   0.0082
100     30     0.0113   0.0493    0.0065   0.0205    0.0032   0.0104    0.0021   0.0057
100     40     0.0112   0.0437    0.0065   0.0189    0.0033   0.0091    0.0020   0.0050
100     50     0.0109   0.0412    0.0068   0.0176    0.0040   0.0082    0.0022   0.0044
 60     20     0.0105   0.0721    0.0070   0.0298    0.0033   0.0155    0.0014   0.0086
 60     30     0.0112   0.0641    0.0065   0.0271    0.0032   0.0133    0.0021   0.0076
 40     20     0.0104   0.0852    0.0070   0.0383    0.0033   0.0191    0.0014   0.0105


4.1. Distribution of the change-point mle for known and unknown parame- 
ters. The assumption of known parameters does not apply in practice, and 
it is common that they must be estimated from the data. While Hinkley 
(1972) has shown asymptotic equivalence of change-point mle under both 
known and estimated cases, its applicability to sample sizes of practical 
interest requires empirical evidence. This issue is perhaps even more impor- 
tant in the multivariate case, mainly because the multivariate case involves 
estimation of many more parameters. As discussed in Sections 2 and 3, for 
comparing the closeness of two distributions, we find it convenient to utilize 
the total variation distance measure, which for discrete random variables X 
and $Y$ is given by $d_{TV}(X, Y) = \frac{1}{2} \sum_{i \in \mathbb{Z}} |P(X = i) - P(Y = i)|$.
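
For discrete distributions stored as probability tables, this distance is a one-line computation. The sketch below assumes that each distribution is given as a dictionary mapping integer support points to probabilities; the function name is our own.

```python
def total_variation(p, q):
    """d_TV = 0.5 * sum_i |p(i) - q(i)| for pmfs given as {integer: probability} dicts."""
    support = set(p) | set(q)
    return 0.5 * sum(abs(p.get(i, 0.0) - q.get(i, 0.0)) for i in support)

# Two small pmfs that differ on three support points.
p = {0: 0.6, 1: 0.4}
q = {0: 0.5, 1: 0.3, 2: 0.2}
print(total_variation(p, q))
```

Points missing from one table are treated as having probability zero, so the two inputs need not share the same support.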

Simulations are performed by letting the parameter choices for sample size 
and true change-point be as follows: $n = 40$, $\tau = 20$; $n = 60$, $\tau = 20$; $n = 60$,
$\tau = 30$; $n = 100$, $\tau = 20$; $n = 100$, $\tau = 30$; $n = 100$, $\tau = 40$ and $n = 100$,
$\tau = 50$. For each of the above cases, the values of $\eta$ are set at
$\eta = 1.0, 1.5, 2.0, 2.5$. The results for univariate and bivariate cases based on
500,000 simulations for each individual scenario are presented in Tables 1
and 2, respectively. As one might expect, the situation of known parameters
yields excellent agreement with the theoretical distribution in both tables,
irrespective of the sample size as well as the location of the change-point.
When parameters are estimated, the univariate case (Table 1) shows very
good to extremely good agreement with the theoretical distribution. The
values, for even the bivariate case (Table 2), show very good agreement except
when $\eta$ is very small ($\eta = 1$).
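
A scaled-down version of such a simulation can be sketched as follows. For the estimated case we assume the usual profile-likelihood form of the mle under unknown means, namely the maximizer of $j(n-j)/n$ times the squared difference of the two segment means; this form, the reduced number of replications and all names below are assumptions made for illustration and are not taken from the text above.

```python
import random
from collections import Counter

def mle_known(ys, mu1, mu2):
    """Change-point mle with known means (the common sigma does not affect the argmax)."""
    best_j, best, run = 1, float("-inf"), 0.0
    for j in range(1, len(ys)):
        y = ys[j - 1]
        run += (y - mu2) ** 2 - (y - mu1) ** 2
        if run > best:
            best_j, best = j, run
    return best_j

def mle_estimated(ys):
    """Assumed profile-likelihood mle with unknown means: maximise j(n-j)/n * gap^2."""
    n, total = len(ys), sum(ys)
    best_j, best, left = 1, float("-inf"), 0.0
    for j in range(1, n):
        left += ys[j - 1]
        gap = left / j - (total - left) / (n - j)
        stat = j * (n - j) / n * gap * gap
        if stat > best:
            best_j, best = j, stat
    return best_j

def empirical_pmf(estimator, n, tau, eta, reps, seed=0):
    """Empirical distribution of the centered estimate over `reps` simulated samples."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(reps):
        ys = [rng.gauss(0.0 if i < tau else eta, 1.0) for i in range(n)]
        counts[estimator(ys) - tau] += 1
    return {k: c / reps for k, c in counts.items()}

known = empirical_pmf(lambda ys: mle_known(ys, 0.0, 2.5), n=100, tau=50, eta=2.5, reps=2000)
est = empirical_pmf(mle_estimated, n=100, tau=50, eta=2.5, reps=2000)
print(known.get(0, 0.0), est.get(0, 0.0))
```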

4.2. Unconditional change-point mle against Cobb's conditional mle. Cobb
(1978) derived the conditional distribution of the change-point mle by
conditioning upon sufficient observations around the true change-point, which
according to Cobb (1978) is also equivalent to the Bayesian posterior when
the prior on the unknown change-point is uniform. If $\delta$ denotes the number
of data points to be considered on either side of $\hat\tau_n$, then Cobb's conditional
solution for $l \in \{-\delta, \ldots, \delta\}$ is given by

(4.1)    $P(\hat\tau_n - \tau_n = l \mid Y_{\hat\tau_n-\delta+1}, \ldots, Y_{\hat\tau_n+\delta}) = p_n(Y; \hat\tau_n + l) \Big/ \sum_{m=-\delta}^{\delta} p_n(Y; \hat\tau_n + m)$.

The method of choosing $\delta$ is clearly detailed in Cobb (1978). It is then
relevant to compare the unconditional distribution of the mle derived in
Section 3 with the above conditional solution. Also, we investigate the
robustness of the exact limiting distribution for departures from normality
through simulations, limiting the study to the univariate framework only.
Here, incorporating both symmetric and asymmetric distributions, the error
structures are modeled by the standardized $t_\nu$ and $\chi^2_\nu$ distributions.
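
The normalization in (4.1) can be sketched directly from the likelihood. The example assumes Gaussian observations with known means and common standard deviation, works with log-likelihoods for numerical stability, and takes the window half-width $\delta$ as given rather than choosing it by Cobb's rule; the function name and the toy data are hypothetical.

```python
import math

def cobb_conditional(ys, tau_hat, delta, mu1, mu2, sigma):
    """Normalised likelihoods p_n(Y; tau_hat + l), l = -delta, ..., delta, as in (4.1)."""
    n = len(ys)
    # log p_n(Y; t) equals sum_{i<=t} a(Y_i) up to a constant that does not depend on t.
    cum = [0.0]
    for y in ys:
        cum.append(cum[-1] + ((y - mu2) ** 2 - (y - mu1) ** 2) / (2.0 * sigma ** 2))
    window = [l for l in range(-delta, delta + 1) if 1 <= tau_hat + l <= n - 1]
    logs = [cum[tau_hat + l] for l in window]
    top = max(logs)  # subtract the maximum before exponentiating
    weights = [math.exp(v - top) for v in logs]
    z = sum(weights)
    return {l: w / z for l, w in zip(window, weights)}

# Toy data with an unambiguous shift after the fifth observation.
post = cobb_conditional([0.0] * 5 + [3.0] * 5, tau_hat=5, delta=2,
                        mu1=0.0, mu2=3.0, sigma=1.0)
print(post)
```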

For simplicity, we let only $\eta = 1.0$ and $\eta = 2.5$, and then perform
simulations for all the choices of sample sizes and true change-points considered
in Section 4.1. The choices of $\nu$ under the $t_\nu$-distribution were $\nu = 5, 10, 20$ and
they were $\nu = 1, 5, 20$ under the $\chi^2_\nu$-distribution. Note that while implementing
Cobb's conditional solution, we determined the value of $\delta$ so that the error
rate detailed in Cobb (1978) is close to $10^{-5}$. To save space, we present the
computed distributions (based on 50,000 simulations) in the form of figures
only, and only for the case of $n = 100$, $\tau = 50$. Figure 1(a-c)
correspond to the cases of normal, $t_5$ and $\chi^2_1$ distributions when $\eta = 1.0$, and
Figure 1(d-f) correspond to the same cases when $\eta = 2.5$.
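
Standardized errors of the two non-Gaussian types can be generated with the standard library alone. The sketch below is one possible construction, assuming that standardization means centering at zero and scaling to unit variance; the function names are our own, and the $t_\nu$ version requires $\nu > 2$ so that the variance exists.

```python
import math
import random

def standardized_t(rng, nu):
    """Student t_nu draw scaled to unit variance (requires nu > 2)."""
    z = rng.gauss(0.0, 1.0)
    chi2 = rng.gammavariate(nu / 2.0, 2.0)  # chi-square with nu degrees of freedom
    return (z / math.sqrt(chi2 / nu)) / math.sqrt(nu / (nu - 2.0))

def standardized_chi2(rng, nu):
    """Chi-square_nu draw centered at zero and scaled to unit variance."""
    return (rng.gammavariate(nu / 2.0, 2.0) - nu) / math.sqrt(2.0 * nu)

rng = random.Random(0)
t_errors = [standardized_t(rng, 5) for _ in range(10)]
chi_errors = [standardized_chi2(rng, 1) for _ in range(10)]
print(t_errors[:3], chi_errors[:3])
```

Adding such errors to the pre-change and post-change means gives non-Gaussian series with the same standardized change $\eta$ as in the Gaussian case.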

Table 2

Total variation distances of known and estimated empirical distributions (based on
500,000 simulations) from theoretical distribution of change-point mle in the bivariate
case

                $\eta = 1$         $\eta = 1.5$        $\eta = 2$         $\eta = 2.5$
  n   $\tau$   Known    Est.      Known    Est.      Known    Est.      Known    Est.
100     20     0.0108   0.0991    0.0066   0.0376    0.0035   0.0197    0.0018   0.0126
100     30     0.0110   0.0718    0.0065   0.0281    0.0034   0.0153    0.0016   0.0099
100     40     0.0119   0.0624    0.0070   0.0252    0.0044   0.0135    0.0017   0.0075
100     50     0.0121   0.0595    0.0076   0.0236    0.0040   0.0126    0.0016   0.0075
 60     20     0.0107   0.1140    0.0066   0.0466    0.0035   0.0248    0.0018   0.0157
 60     30     0.0107   0.1006    0.0065   0.0410    0.0034   0.0218    0.0016   0.0146
 40     20     0.0105   0.1383    0.0065   0.0647    0.0035   0.0350    0.0018   0.0233

Fig. 1. Plots of theoretical mle, empirical mle (known), empirical mle (estimated),
empirical cmle (known) and empirical cmle (estimated) distributions of the centered
change-point when $n = 100$, $\tau = 50$ under normal (a); $t_5$ (b) and $\chi^2_1$ (c) when
$\eta = 1.0$; and normal (d); $t_5$ (e) and $\chi^2_1$ (f) when $\eta = 2.5$.

For the remaining cases, we summarized the computed distributions through
Bias and mean square error (MSE), and to save space, we only describe the
salient features of these computations. It can be seen from Figure 1(a) that
in the normal case, the unconditional distributions under both known and
estimated cases are almost identical and they closely agree with the
theoretical distribution even when the change is small with $\eta = 1.0$. While the
distributions of the cmle under known and estimated cases are also nearly identical
to each other, there is more spread in the cmle, with the probability at the
true change-point being substantially smaller than that of the unconditional
mle. It is clear from Figure 1(b) and (c) that robustness to deviations from
normality is quite pronounced even when degrees of freedom under the $t_5$ and $\chi^2_1$
distributions are small. Moving on to $\eta = 2.5$, we find from Figure 1(d-f)
that, overall, there is greater robustness and even better agreement between
known and estimated solutions.

Though not presented, the bias and MSE values show some differences between
the known case and the estimated case, mainly when $\eta$ is small
($\eta = 1.0$). The robustness for large changes ($\eta = 2.5$) is extremely
good throughout the computations, thus indicating good tail behavior for large
changes under both the $t$ and $\chi^2$ distributions. Also, extreme behavior
is noticed for the estimated case when $\eta = 1.0$ and $n = 100$,
$\tau = 20$. In this case, Cobb's cmle shows somewhat smaller MSE values than
the mle, though only marginally. For all other parameter choices, the mle
performs better in terms of MSE values.

Finally, we noticed that the MSE values for the mle in the known case are
lower than the corresponding theoretical MSE values and that the MSE values
increase with the sample size. This behavior can be explained by the fact that
the theoretical distribution derived for infinite samples possesses an
infinite domain, whereas the domain under finite samples is truncated by the
sample size. This truncation effect for finite samples is found to be most
pronounced when $n = 40$. The same argument also explains why MSE values in
both tables increase with increasing sample sizes.

5. Multivariate change-point analysis of water discharges at Nacetinsky
creek. The Nacetinsky is a small creek in the German part of the Erzgebirge
Mountains. Gombay and Horvath (1997) analyzed the monthly averages of water
discharges for the Nacetinsky creek during the years 1951-1990 and found that
the lognormal distribution appropriately models the monthly average discharges
in the creek. Consequently, after a log transformation, they applied
likelihood ratio based change-detection methodology in a univariate framework
for detecting changes in the mean only, as well as changes in the variance
only, of the normal distribution for the transformed data. When changes were
detected, they obtained point estimates of the unknown change-point as the
value at which the likelihood ratio was maximum. In detecting the change
points, Gombay and Horvath (1997) found that the change-detection methodology
under independence was applicable for the monthly water discharges.

We revisited the monthly data and first analyzed the data in a univariate 
setup, mainly for detecting changes in mean only or variance only of the 
transformed data. Applying the respective likelihood ratio change-detection 
statistics (B.2) and (B.4) in Appendix B, we found no evidence of change 
in either the mean or in the variance for almost all months. We were then 
interested to learn whether bivariate or multivariate analyses might convey 
a different message than what has been learned from the univariate analysis. 
One can expect significant covariances in the water discharges among various 
months within a year, and it is of interest to know whether such covariances 
contribute significantly as one pursues change-detection and estimation. To 
this extent, we found that a multivariate analysis of the data for the months 
of February, July and August yields some interesting results. 

Change-point analysis, whether at the univariate level or at the multi- 
variate level, involves two parts, namely, change-detection and change-point 
estimation whenever a change-point is detected. The focus of this paper 
clearly is on estimation, where we derive computable expressions for the 
asymptotic distribution of the change-point mle. Change-detection is not 
pursued in the theoretical part of this paper. However, change-detection 
precedes change-point estimation for the analysis of data. Keeping this in 
mind, we first present analysis and results from change-detection in Ap- 
pendix B, and only results from change-point estimation will be emphasized 
in this section. Once again, our analysis in both detection and estimation is 
based on log transformed water discharges data for the months of February, 
July and August as reported in Figure 2. 

To proceed with the formulation, let $Y_i$ represent the log transformed
monthly water discharges at the Nacetinsky creek for the months of February,
July and August for the $i$th year, $i = 1, \ldots, 40$, so that in this case
the dimension is $d = 3$ and the sample size is $n = 40$. We begin modeling
the data by assuming that $Y_1, \ldots, Y_n$ are independent and that
$Y_i \sim N(\mu^{(i)}, \Sigma)$, $i = 1, \ldots, n$. Under the change-point
setup with $\tau_n$ as the unknown change-point, one lets
$\mu^{(i)} = \mu_1$, $i = 1, \ldots, \tau_n$, and $\mu^{(i)} = \mu_2$,
$i = \tau_n + 1, \ldots, n$.
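
As a rough illustration of this setup, the following sketch computes the
change-point mle when $\mu_1$, $\mu_2$ and $\Sigma$ are treated as known. It
maximizes over $t$ the partial sums of the Gaussian log-likelihood ratio
$(\mu_1 - \mu_2)^T \Sigma^{-1} (Y_i - (\mu_1 + \mu_2)/2)$. The function names
`solve` and `change_point_mle` and the noise-free toy data are hypothetical,
introduced only for this example; they are not part of the original analysis.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def change_point_mle(ys, mu1, mu2, cov):
    """Return the t in {1, ..., n-1} maximizing the known-parameter likelihood."""
    d = len(mu1)
    # w = Sigma^{-1} (mu1 - mu2); each observation contributes w' (y - midpoint).
    w = solve(cov, [mu1[k] - mu2[k] for k in range(d)])
    mid = [(mu1[k] + mu2[k]) / 2.0 for k in range(d)]
    best_t, best_sum, running = 1, float("-inf"), 0.0
    for t in range(1, len(ys)):
        y = ys[t - 1]
        running += sum(w[k] * (y[k] - mid[k]) for k in range(d))
        if running > best_sum:  # strict inequality keeps the earliest maximizer
            best_t, best_sum = t, running
    return best_t


# Noise-free toy data (an assumption): 14 observations at mu1, then 26 at mu2.
mu1, mu2 = [0.0, 0.0], [1.0, 1.0]
cov = [[1.0, 0.0], [0.0, 1.0]]
ys = [mu1] * 14 + [mu2] * 26
tau_hat = change_point_mle(ys, mu1, mu2, cov)
```

With real data the parameters would have to be estimated, as in Appendix B;
the sketch only shows the known-parameter case that the theory assumes.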

With the above as the basic setup, one can first apply change-detection
methodology, and this has been done comprehensively in Appendix B. Basically,
the bivariate tests for the Feb-Jul and Feb-Aug pairs, as well as the
multivariate test for all three months, were found to be significant even
though none of the univariate tests showed significance. The bivariate and
multivariate analyses resulted in the change-point mle being
$\hat\tau_n = 14$, so that a change in water discharges occurred subsequent to
the year 1964. The analysis in the Appendix was quite supportive of the
assumptions of both Gaussianity and independence.

Fig. 2. Time series plot of log transformed data on mean monthly water discharges of
the Nacetinsky creek for the months of February, July and August for the years 1951-1990.

We shall now apply the theoretical distribution derived in Section 3 to the
data in Figure 2 under the bivariate and trivariate cases. Based on
$\hat\tau_n = 14$, we estimated the values of $\eta$ to be
$\hat\eta_{FJ} = 1.47$, $\hat\eta_{FA} = 1.52$ and $\hat\eta_{FJA} = 1.60$.
Treating these as known values, we implemented the theoretical distribution
for each of the three cases. We found the period 1960-1968 to yield confidence
levels of 94.8%, 95.6% and 96.5%, respectively. Simulations suggest that the
same period under both bivariate and trivariate estimated cases, with true
parameter values set at $\eta = 1.51$ and $n = 40$, $\tau = 14$, yields a
confidence level of 90%. Applying the conditional distribution of Cobb (1978)
to the same data with an error rate of approximately $10^{-5}$, we found that
the period with 95% coverage probability is 1963-1971 for Feb-Jul, 1963-1969
for Feb-Aug and 1963-1967 for Feb-Jul-Aug. Clearly, for this particular data
set, Cobb's cmle seems to yield shorter confidence intervals than the
unconditional mle. However, under repeated samples for data of the same size
with the true parameters set at $\eta = 1.51$ and $n = 40$, $\tau = 14$, we
found that the period 1960-1968 under Cobb's cmle yields a coverage
probability of 88% under both bivariate and trivariate cases, thus showing a
performance similar to that of the mle on average.

6. Discussion. The asymptotic distribution of the change-point mle is quite
complicated, and an exact computable expression for the distribution of the
mle has not been derived in the literature to date, even though Hinkley
(1970, 1971, 1972) published his seminal work more than three decades ago.
Assuming the parameters before and after the unknown change-point to be
known, this investigation establishes an exact and yet computationally
attractive form for the asymptotic distribution of the change-point mle, thus
far not available in the literature.

To have a better understanding of its performance, we carried out an
empirical study to compare the distribution under known parameters with the
case where the nuisance parameters remain unknown. We also compared the
derived distribution with the conditional distribution of Cobb (1978) and
assessed the robustness of the derived distribution to departures from
normality. Simulations have shown good agreement between known and estimated
cases, except for the case where parameters are estimated and the amount of
change is relatively small. Also, both the mle and the cmle are quite robust
to deviations from normality, for the most part.

We have applied the derived change-point estimation methodology to compute
the asymptotic distribution under both mle and cmle methods for the log
transformed data on annual mean discharges for the months of February, July
and August for the Nacetinsky creek for the years 1951-1990. At first it may
appear that a sample size of $n = 40$ is somewhat small for asymptotics to
apply. However, simulations under the estimated case for samples of this size
show excellent accuracy in the univariate case (Table 1, $\eta = 1.5$) and
good accuracy in the bivariate case (Table 2, $\eta = 1.5$). Detection
methodology for this data set under the univariate setup yields no
significance for the presence of a change-point for any of the three months.
However, change-detection under the multivariate setup shows significance for
Feb-Jul and Feb-Aug in the bivariate case and also for the trivariate case of
Feb-Jul-Aug.

In summary, the methodology proposed in this article appears quite useful
for practitioners in all areas, mainly because it is readily computable,
and it is quite robust to deviations from the assumption of Gaussianity. Also,
sample size does not seem to be a serious concern while implementing the 
asymptotic result. In terms of future directions, it would be of interest to de- 
rive such computationally feasible distributions for other distributions such 
as exponential and Weibull in the continuous case and binomial and Poisson 
in the discrete case. 

APPENDIX A 

Proof of Theorem 3.1. The proof of the theorem essentially follows upon
applying the following three lemmas to Theorem 2.1.

The following lemma is well known [see, e.g., Shiryaev et al. (1994)], and
will be given without proof. It should be noted that even though the original
result was given for continuous Brownian motion, the same can be applied for
a random walk with negative drift. This lemma addresses the fundamental issue
of establishing the distributions of $M$ (and $M^*$) in a simple exponential
form, thereby making the integrals in Theorem 2.1 analytically tractable.

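Lemma 1. Let the random walk $\{S_n, n \ge 0\}$ be as specified in (2.3).
Then, for $x > 0$,

$$P\Bigl(\max_{m \le n} S_m < x\Bigr)
  = \Phi\Bigl(\frac{x + n\eta^2/2}{\eta\sqrt{n}}\Bigr)
  - e^{-x}\,\Phi\Bigl(\frac{-x + n\eta^2/2}{\eta\sqrt{n}}\Bigr)
  \to 1 - e^{-x} = P(M < x) \quad \text{as } n \to \infty.$$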
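
The convergence stated in Lemma 1 can be checked numerically. The sketch below
evaluates the running-maximum law of a Brownian motion with drift
$-\eta^2/2$ and variance $\eta^2$ per unit time, which is assumed here to
match the increments of the walk in (2.3), and compares it with the
exponential limit $1 - e^{-x}$. The function names are hypothetical.

```python
import math


def phi_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def max_cdf_finite(x, n, eta):
    """P(max up to time n < x) for drift -eta^2/2 and variance eta^2 per step."""
    scale = eta * math.sqrt(n)
    drift = n * eta ** 2 / 2.0
    return phi_cdf((x + drift) / scale) - math.exp(-x) * phi_cdf((-x + drift) / scale)


def max_cdf_limit(x):
    """Limit of the expression above as n grows."""
    return 1.0 - math.exp(-x)


# The finite-n probability decreases toward the limit as n increases.
values = [max_cdf_finite(1.0, n, 1.0) for n in (1, 10, 100, 10000)]
limit = max_cdf_limit(1.0)
```

In this sketch the gap between `values[-1]` and `limit` is negligible, which
is the sense in which the finite-horizon expression is replaced by its
exponential limit in the integrals of Theorem 2.1.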



The following remark, which provides the complementary probability for $M$
for strictly positive values ($x > 0$), plays an important role in the proof
of the theorem.

Remark. Note that $P(M > x) = P(M > x \mid M > 0) P(M > 0) = \|G_+\| e^{-x}$,
$x > 0$.

The next lemma provides an analytical and convenient expression for
$P(T_1^- > n \cap S_n \in dx)$. As can be seen from the proof of Lemma 3, this
lemma is critical for carrying out the integrals in Theorem 2.1 in a fully
analytical manner.

Lemma 2. Let the random walk $\{S_n, n \ge 0\}$ be as specified in (2.3).
Then, for $x > 0$,

$$P(T_1^- > n \cap S_n \in dx)
  = \eta^{-1} E\Bigl\{ I(T_1^- > n-1)\,
    \phi\Bigl(\frac{x - S_{n-1} + \eta^2/2}{\eta}\Bigr) \Bigr\}\,dx,
  \qquad n \ge 1.$$



Proof. In light of (3.1), we have that, for $x > 0$,

$$\begin{aligned}
P\{T_1^- > n \cap S_n \in (0, x]\}
 &= P\Bigl\{ \bigcap_{j=1}^{n-1} (S_j > 0) \cap S_n \in (0, x] \Bigr\} \\
 &= E\Bigl[ I\Bigl\{ \bigcap_{j=1}^{n-1} (S_j > 0) \Bigr\}\,
      P\bigl(X_n \in (-S_{n-1}, x - S_{n-1}] \mid \mathcal{F}_{n-1}\bigr) \Bigr] \\
 &= E\Bigl[ I(T_1^- > n-1) \Bigl\{
      \Phi\Bigl(\frac{x - S_{n-1} + \eta^2/2}{\eta}\Bigr)
      - \Phi\Bigl(\frac{-S_{n-1} + \eta^2/2}{\eta}\Bigr) \Bigr\} \Bigr],
      \qquad n \ge 1.
\end{aligned} \tag{A.1}$$

Thus, differentiating (A.1) with respect to $x$ completes the proof of
Lemma 2. □

The next lemma provides a manageable expression for the second term 
in Theorem 2.1. 

Lemma 3. The following holds:

$$\int_0^\infty P(M^* > x)\, P(T_1^- > n \cap S_n \in dx)
  = \|G_+^*\|\, E\{ e^{-S_n} I(T_1^- > n) \}, \qquad n \ge 1.$$

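Proof. Using Lemma 2, and the remark following Lemma 1, we note that

$$\begin{aligned}
\int_0^\infty P(M^* > x)\, P(T_1^- > n \cap S_n \in dx)
 &= \eta^{-1} \|G_+^*\|\, E\Bigl\{ I(T_1^- > n-1) \int_0^\infty e^{-x}\,
      \phi\Bigl(\frac{x - S_{n-1} + \eta^2/2}{\eta}\Bigr) dx \Bigr\} \\
 &= \|G_+^*\|\, E\{ I(T_1^- > n-1)\, e^{-S_n}\,
      I(\eta Z_n > -S_{n-1} + \eta^2/2) \} \\
 &= \|G_+^*\|\, E\{ e^{-S_n} I(T_1^- > n) \}, \qquad n \ge 1.
\end{aligned}$$

□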
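
The middle step in the proof of Lemma 3 can be spelled out by a change of
variable. The following is a sketch, assuming (as the proof of Lemma 2
suggests) that $X_n = \eta Z_n - \eta^2/2$ with $Z_n$ standard normal and
independent of $\mathcal{F}_{n-1}$, so that $S_n = S_{n-1} + \eta Z_n -
\eta^2/2$.

```latex
\begin{align*}
\frac{1}{\eta} \int_0^\infty e^{-x}\,
  \phi\!\left( \frac{x - S_{n-1} + \eta^2/2}{\eta} \right) dx
 &= \int_{(-S_{n-1} + \eta^2/2)/\eta}^{\infty}
      e^{-(S_{n-1} + \eta z - \eta^2/2)}\, \phi(z)\, dz
    && (x = S_{n-1} + \eta z - \eta^2/2) \\
 &= E\bigl\{ e^{-S_n}\, I(\eta Z_n > -S_{n-1} + \eta^2/2)
      \,\big|\, \mathcal{F}_{n-1} \bigr\}.
\end{align*}
```

Because $\{T_1^- > n\} = \{T_1^- > n-1\} \cap \{S_n > 0\}$ and $S_n > 0$ is
the same event as $\eta Z_n > -S_{n-1} + \eta^2/2$, multiplying by
$I(T_1^- > n-1)$ and taking expectations gives the last line of the proof.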



Remarks regarding computational aspects of expressions in Theorem 3.1.

Here, we first address computational issues of the two sequences
$\{q_n : n \ge 1\}$ and $\{\tilde q_n : n \ge 1\}$ that appear in Theorem 3.1.
Set $b_n = P(S_n > 0)$ and $\tilde b_n = E\{e^{-S_n} I(S_n > 0)\}$, for
$n \ge 1$. From Feller (1971), Volume II, page 416, and Chover, Ney and
Wainger (1973), it is well known that the generating functions of the
sequences $\{q_n : n \ge 1\}$ and $\{\tilde q_n : n \ge 1\}$, respectively,
satisfy the following relationships:

$$\sum_{n=0}^{\infty} s^n q_n
    = \exp\Bigl\{ \sum_{n=1}^{\infty} \frac{s^n b_n}{n} \Bigr\}
  \quad \text{and} \quad
  \sum_{n=0}^{\infty} s^n \tilde q_n
    = \exp\Bigl\{ \sum_{n=1}^{\infty} \frac{s^n \tilde b_n}{n} \Bigr\}.
  \tag{A.2}$$



Note that the second equation in (A.2) appears in Chover, Ney and Wainger
(1973) as a type of Laplace transform. In addition, both equations in (A.2)
may be obtained iteratively as simple consequences of the Wiener-Hopf
factorization. In particular, the Leibniz rule yields the following iterative
relations, and thus enables one to compute $\{q_n : n \ge 1\}$ and
$\{\tilde q_n : n \ge 1\}$:

$$n q_n = \sum_{j=0}^{n-1} b_{n-j}\, q_j
  \quad \text{and} \quad
  n \tilde q_n = \sum_{j=0}^{n-1} \tilde b_{n-j}\, \tilde q_j,
  \qquad n = 1, 2, \ldots, \tag{A.3}$$

and $\tilde q_0 = q_0 = 1$.

Note that, in the Gaussian case, 6 n = 
n > 1. 
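
The recursion (A.3) is straightforward to program. The sketch below computes
$q_0, \ldots, q_N$ from any supplied sequence $b_n$. The helper `gaussian_b`
is an assumption made for this example only: it takes $S_n \sim
N(-n\eta^2/2, n\eta^2)$, so that $P(S_n > 0) = \Phi(-\eta\sqrt{n}/2)$. All
function names are hypothetical.

```python
import math


def phi_cdf(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def gaussian_b(n, eta):
    """P(S_n > 0), assuming S_n ~ N(-n * eta^2 / 2, n * eta^2)."""
    return phi_cdf(-eta * math.sqrt(n) / 2.0)


def q_sequence(b, n_max):
    """Return [q_0, ..., q_{n_max}] from n q_n = sum_{j<n} b(n-j) q_j, q_0 = 1.

    `b` is a function mapping n >= 1 to b_n.
    """
    q = [1.0]
    for n in range(1, n_max + 1):
        q.append(sum(b(n - j) * q[j] for j in range(n)) / n)
    return q


# Sanity check with a closed form: b_n = c^n gives exp(-log(1 - c s)) = 1/(1 - c s),
# whose power-series coefficients are q_n = c^n.
geometric = q_sequence(lambda n: 0.5 ** n, 6)

# Gaussian case with eta = 1.0 (under the assumption stated above).
gaussian = q_sequence(lambda n: gaussian_b(n, 1.0), 5)
```

The same routine applies to the second relation in (A.3) once the
corresponding sequence is supplied in place of `b`.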

Next, we demonstrate that the probabilities in Theorem 3.1 sum to one, 
and then provide an expression for the variance of the limiting distribution. 

From Hinkley (1970), and the remark after Lemma 1 above, it follows that

$$\begin{aligned}
P(\xi_\infty > 0) &= P(M^* > M, M^* > 0)
   = \int_{0+}^{\infty} P(M < x)\, P(M^* \in dx) \\
 &= \int_{0+}^{\infty} (1 - \|G_+\| e^{-x})\, \|G_+\| e^{-x}\, dx
   = \{1 - (1 - \|G_+\|)^2\}/2.
\end{aligned}$$

Since $P(\xi_\infty = 0) = (1 - \|G_+\|)^2$, and $\xi_\infty$ is symmetric,
we have $2 P(\xi_\infty > 0) + P(\xi_\infty = 0) = 1$, so the claim that the
probabilities for $\xi_\infty$ sum to one follows immediately. The following
expression for the variance may be derived in a somewhat tedious but
straightforward manner:
Var(^ 0O ) = 2{ J B"(l) + ( J B'(l)) 2 } 

- 2exp(-B(l) + S(l))(l - exp(-B(l))){S"(l) + (B'(l)) 2 }, 

where B(l) = E~ iW™, B'(l) = £~ ^ B"(l) = £~ ^ and fl(l), 
B'(l) and B"(l) are defined upon b n ,n > 1, in a similar manner. 

APPENDIX B 

Change-point detection for Nacetinsky water discharges. We first formulate
the following hypotheses that test for the presence of an unknown
change-point in the mean vector of the data series:

$$H_0: \mu^{(1)} = \cdots = \mu^{(n)} = \mu
  \quad \text{vs.} \quad
  H_a: \mu^{(1)} = \cdots = \mu^{(\tau)} \ne \mu^{(\tau+1)} = \cdots = \mu^{(n)},
  \tag{B.1}$$

where $\tau \in \{1, \ldots, n-1\}$ is the unknown change-point. Asymptotic
theory of the generalized likelihood ratio statistic for testing the above
hypotheses has been well addressed in the literature, and the limiting result
may be found in Csörgő and Horváth (1997). It may be shown that twice the
log-likelihood ratio statistic for testing the above hypotheses is

$$U_n = \max_{1 \le t \le n-1} n \log\bigl( |\hat\Sigma_n| / |\hat\Sigma_t| \bigr),
  \tag{B.2}$$

where $\hat\Sigma_t = n^{-1} \{ \sum_{i=1}^{t} (Y_i - \hat\mu_{1,t})
(Y_i - \hat\mu_{1,t})^T + \sum_{i=t+1}^{n} (Y_i - \hat\mu_{2,t})
(Y_i - \hat\mu_{2,t})^T \}$, $\hat\mu_{1,t} = t^{-1} \sum_{i=1}^{t} Y_i$ and
$\hat\mu_{2,t} = (n-t)^{-1} \sum_{i=t+1}^{n} Y_i$, $t = 1, \ldots, n$. The
asymptotic distribution of the above statistic is based upon
$W_n = (2 \log\log n \; U_n)^{1/2} - (2 \log\log n + \frac{p}{2} \log\log\log n
- \log\Gamma(p/2))$, where $p$ denotes the number of parameters that change
under the alternative hypothesis, and in this case we have $p = d = 3$. The
limiting distribution of $W_n$ is given by the following double exponential
form:

$$\lim_{n \to \infty} P[W_n \le t] = \exp(-2 e^{-t}). \tag{B.3}$$

The p-value is obtained based on a two-sided critical region of the above
limiting distribution. When a test is significant, the maximum likelihood
estimator of the unknown change-point $\tau$ is obtained as the argument at
which $U_n$ attains its maximum. In principle, we may apply the above
procedure for the data of each month individually with $p = 1$, and also for
data on each pair of months with $p = 2$. The results of the tests for all
cases are presented in Table 3. Clearly, none of the univariate tests is
significant. Among the bivariate tests, the pair July-August is not
significant, whereas the other two pairs yield significance. The multivariate
test for all three months is also significant. The significance based upon
the bivariate and multivariate tests takes into account the covariance
structure in the data and hence should be given more weight than the
univariate tests, where no significance is found. The change-point mle is
obtained as $\hat\tau = 14$.
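
The normalization and the limiting law (B.3) can be sketched in a few lines.
Two assumptions are made here and should be read as such: the coefficient of
$\log\log\log n$ is taken to be $p/2$, as in the standard normalization for
this type of statistic, and the p-value is computed as the upper-tail
probability $1 - \exp(-2e^{-w})$ of the limit law. The latter choice
reproduces the p-values listed in Table 3 to about three decimals when the
rounded $W$ values are plugged in. The function names are hypothetical.

```python
import math


def w_statistic(u_n, n, p):
    """Normalize the likelihood ratio statistic U_n as in the definition of W_n."""
    lln = math.log(math.log(n))
    centering = 2.0 * lln + (p / 2.0) * math.log(lln) - math.lgamma(p / 2.0)
    return math.sqrt(2.0 * lln * u_n) - centering


def p_value(w):
    """Upper-tail probability under the limit law exp(-2 exp(-t)) in (B.3)."""
    return 1.0 - math.exp(-2.0 * math.exp(-w))


# Rounded W values for Feb, Jul and Feb-Jul-Aug as listed in Table 3.
p_feb = p_value(2.74)
p_jul = p_value(1.86)
p_all = p_value(3.78)
```

Since the tabulated $W$ values are rounded to two decimals, agreement with
the tabulated p-values beyond the third decimal should not be expected.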

Table 3
The statistic W for change in mean for various months and their p-values

Months          W       p-value    $\hat\tau$
Feb             2.74    0.1206     15
Jul             1.86    0.2674     14
Aug             2.29    0.1825     14
Feb-Jul         3.59    0.0539     14
Feb-Aug         3.76    0.0455     14
Jul-Aug         1.90    0.2593     14
Feb-Jul-Aug     3.78    0.0448     14

At this point, we need to investigate the validity of the main assumptions,
namely, constancy of the covariance matrix, Gaussianity and independence over
time. The investigation regarding the covariance matrix requires that we
compute the deviation vector $D_i$, $i = 1, \ldots, 40$, from the estimated
mean for each observation, taking into account the differences in the means
before and after the estimated change-point. It is of interest then to know
whether the covariance structure of the deviations remained constant
throughout the sampling period. The generalized log-likelihood ratio statistic
for the constancy of the covariance matrix over time against the alternative
that the covariance matrix has changed at an unknown time is given by

$$U^* = \max_{1 \le t \le n-1} \log\bigl\{ |\hat\Sigma_{1:n}|^{n} /
  ( |\hat\Sigma_{1:t}|^{t}\, |\hat\Sigma_{t+1:n}|^{n-t} ) \bigr\}, \tag{B.4}$$

where $\hat\Sigma_{1:t}$ and $\hat\Sigma_{t+1:n}$ are the usual estimators of
the covariance matrix based on the first $t$ and last $n - t$ deviations,
respectively. The limiting distribution of $U^*$ is obtained through the
distribution of $W^*$, where $W^*$ is defined upon $U^*$ in an analogous
manner. It follows that $p$, the number of parameters that change in this
case, is given by $p = d(d+1)/2$. The p-values for the univariate, bivariate
and multivariate tests are reported in Table 4. Clearly, all tests are
insignificant except the multivariate test. However, the significance is not
particularly relevant since the change-point mle of 3 obtained in this case
implies no change in the covariance structure, for all practical purposes.
Thus, there is no evidence in the data against the assumption of stationarity
of the covariance matrix. Utilizing the estimated change-point
($\hat\tau = 14$), estimates for the mean vector before and after the
change-point as well as the pooled estimator of the common covariance matrix
are then obtained as $\hat\mu_{1,\hat\tau} = (6.738, 7.137, 6.725)$,
$\hat\mu_{2,\hat\tau} = (7.383, 7.483, 7.166)$ and

$$\hat\Sigma_{\hat\tau} = \begin{pmatrix}
   0.365 & -0.032 & -0.029 \\
  -0.032 &  0.161 &  0.104 \\
  -0.029 &  0.104 &  0.211
\end{pmatrix}.$$
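
These estimates can be turned into a standardized amount of change. The
sketch below assumes that $\eta$ is the Mahalanobis distance between the
pre-change and post-change mean vectors, $\eta^2 = (\mu_2 - \mu_1)^T
\Sigma^{-1} (\mu_2 - \mu_1)$; the definition itself is given in the
theoretical sections and is not restated here, so this should be read as an
assumption. The function names `solve` and `eta_hat` are hypothetical.

```python
import math

MU1 = [6.738, 7.137, 6.725]   # estimated mean before the change (Feb, Jul, Aug)
MU2 = [7.383, 7.483, 7.166]   # estimated mean after the change
COV = [[0.365, -0.032, -0.029],
       [-0.032, 0.161, 0.104],
       [-0.029, 0.104, 0.211]]


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        tail = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - tail) / m[r][r]
    return x


def eta_hat(mu1, mu2, cov, idx):
    """Mahalanobis distance between mu1 and mu2 restricted to the months in idx."""
    delta = [mu2[i] - mu1[i] for i in idx]
    sub = [[cov[i][j] for j in idx] for i in idx]
    w = solve(sub, delta)  # Sigma^{-1} delta for the selected months
    return math.sqrt(sum(d * v for d, v in zip(delta, w)))


eta_fj = eta_hat(MU1, MU2, COV, [0, 1])       # Feb-Jul
eta_fa = eta_hat(MU1, MU2, COV, [0, 2])       # Feb-Aug
eta_fja = eta_hat(MU1, MU2, COV, [0, 1, 2])   # Feb-Jul-Aug
```

Under this assumption the three values come out at approximately 1.47, 1.52
and 1.60, in line with the estimates of $\eta$ quoted in Section 5.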

Table 4
The statistic W for change in variance for various months and their p-values

Months          W       p-value    $\hat\tau$
Feb             3.18    0.0796     3
Jul             1.91    0.2556     5
Aug             1.39    0.3929     2
Feb-Jul         3.02    0.0927     3
Feb-Aug         2.28    0.1842     2
Jul-Aug         2.32    0.1788     2
Feb-Jul-Aug     4.26    0.0278     3

It remains to be seen whether the assumptions of Gaussianity and independence
over time are valid. We can verify this by utilizing the deviation vectors
$D_i$, $i = 1, \ldots, 40$, and the covariance matrix
$\hat\Sigma_{\hat\tau}$ found above. Specifically, if $D_i$ is multivariate
normal, then it is well known that
$d_i^2 = D_i^T \hat\Sigma_{\hat\tau}^{-1} D_i$ is approximately chi-square
with 3 degrees of freedom, $i = 1, \ldots, 40$. The same can be applied for
the bivariate case also, with the degrees of freedom being 2 in this case.
Thus, one only needs to verify whether $d_i^2$, $i = 1, \ldots, 40$, form a
sample from the corresponding chi-square distribution. Upon applying the
Anderson-Darling statistic, we found the p-value for the three months case to
be 0.185. The corresponding p-values for the Feb-Jul, Feb-Aug and Jul-Aug
pairs were 0.244, 0.250 and 0.10, respectively. In the univariate case, we
applied the Anderson-Darling test to the deviations for each individual month
and found the p-values to be 0.927, 0.530 and 0.177, respectively. Thus, the
assumption of Gaussianity seems quite appropriate at each of the univariate,
bivariate and multivariate levels.

As for independence over time, we first tested each of the three deviation 
series for significance of both autocorrelations and partial autocorrelations 
up to the first twenty lags. The ACF and PACF plots for each individual 
series showed no evidence of significant correlations. We then computed the 
cross-correlations for each pair and found that these were also not significant 
and, thus, there was no indication that the assumption of independence 
over time was in violation. Overall, the change-point model with estimated 
parameters may be seen to fit the data quite well. 

Acknowledgments. The authors thank the Editor Michael Stein, the As- 
sociate Editor and two anonymous referees for their in-depth comments 
and suggestions that led to a substantial improvement in both content and 
presentation of the paper. We are especially thankful to Professor Daniela 
Jaruskova for providing us the data on Nacetinsky creek. 

REFERENCES 

Andrews, D. W. K. and Ploberger, W. (1994). Optimal tests when a nuisance param- 
eter is present only under the alternative. Econometrica 62 1383-1414. MR1303238 

Asmussen, S. (1987). Applied Probability and Queues. Wiley, New York. MR0889893

Basseville, M. and Nikiforov, I. V. (1993). Detection of Abrupt Changes: Theory and 
Application. Prentice Hall, Englewood Cliffs, NJ. MR1210954 

Borovkov, A. A. (1999). Asymptotically optimal solutions in the change-point problem. 
Theory Probab. Appl. 43 539-561. 

Braun, J. V. and Müller, H.-G. (1998). Statistical methods for DNA sequence
segmentation. Statist. Sci. 13 142-162.

Brodsky, B. E. and Darkhovsky, B. S. (1993). Nonparametric Methods in Change- 
point Problems. Springer, New York. MR1228205 

Brodsky, B. E. and Darkhovsky, B. S. (2000). Non-parametric Statistical Diagno- 
sis: Problems and Methods. Mathematics and Its Applications 509. Kluwer Academic, 
Dordrecht. 

Chen, J. and Gupta, A. K. (2000). Parametric Statistical Change Point Analysis.
Birkhäuser, New York. MR1761850

Chover, J., Ney, P. and Wainger, S. (1973). Functions on probability measures.
J. Anal. Math. 26 255-302. MR0348393

Cobb, G. W. (1978). The problem of the Nile: Conditional solution to a
change-point problem. Biometrika 65 243-251. MR0513930

Csörgő, M. and Horváth, L. (1997). Limit Theorems in Change-Point Analysis.
Wiley, New York.

DeGaetano, A. T. (2006). Attributes of several methods for detecting discontinuities 
in temperature series: Prospects for a hybrid homogenization procedure. J. Climate 9 
1646-1660. 

Fealy, R. and Sweeney, J. (2005). Detection of a possible change point in
atmospheric variability in the North Atlantic and its effect on Scandinavian
glacier mass balance. Int. J. Climatol. 25 1819-1833.

Fearnhead, P. (2006). Exact and efficient Bayesian inference for multiple
change-point problems. Stat. Comput. 16 203-213. MR2227396

Fearnhead, P. and Liu, Z. (2007). On-line inference for multiple change points
problems. J. Roy. Statist. Soc. Ser. B 69 589-605. MR2370070

Feller, W. R. (1971). An Introduction to Probability Theory and Its
Applications, Vol. II. Wiley, New York. MR0270403

Fotopoulos, S. B. and Jandhyala, V. K. (2001). Maximum likelihood estimation
of a change-point for exponentially distributed random variables. Statist.
Probab. Lett. 51 423-429. MR1820801

Fotopoulos, S. B. (2009). The geometric convergence rate of the classical
change-point estimate. Statist. Probab. Lett. 79 131-137. MR2483529

Girón, F. J., Moreno, E. and Casella, G. (2007). Objective Bayesian analysis
of multiple changepoints for linear models (with discussion). In Bayesian
Statistics 8 (J. M. Bernardo, M. J. Bayarri and J. O. Berger, eds.) 227-252.
Oxford Univ. Press, Oxford. MR2433195

Gombay, E. and Horvath, L. (1997). An application of the likelihood method to change- 
point detection. Environmetrics 8 459-467. 

Hansen, B. E. (2000). Testing for structural change in conditional models. J. Economet- 
rics 97 93-115. MR1788819 

Hinkley, D. V. (1970). Inference about the change-point in a sequence of random vari- 
ables. Biometrika 57 1-17. MR0273727 

Hinkley, D. V. (1971). Inference about the change-point from cumulative sum tests. 
Biometrika 58 509-523. MR0312623 

Hinkley, D. V. (1972). Time ordered classification. Biometrika 59 509-523. MR0368317 

Hu, I. and Rukhin, A. L. (1995). A lower bound for error probability in change-point
estimation. Statist. Sinica 5 319-331. MR1329301

Jandhyala, V. K. and Fotopoulos, S. B. (1999). Capturing the distributional behav- 
ior of the maximum likelihood estimator of a change-point. Biometrika 86 129-140. 
MR1688077 

Jandhyala, V. K. and Fotopoulos, S. B. (2001). Rate of convergence of the maximum 
likelihood estimate of a change-point. Sankhya Ser. A 63 277-285. MR1897454 

Jaruskova, D. (1996). Change-point measurement in meteorological measurement. Mon. 
Weather Rev. 124 1535-1543. 

Kaplan, A. Y. and Shishkin, S. L. (2000). Application of the change-point analysis to the 
investigation of the brain's electrical activity. In Non-Parametric Statistical Diagnosis: 
Problems and Methods (B. E. Brodsky and B. S. Darkhovsky, eds.) 333-388. Kluwer, 
Dordrecht. MR1862475 

Lai, T. L. (1995). Sequential change-point detection in quality control and
dynamical systems. J. Roy. Statist. Soc. Ser. B 57 613-658. MR1354072

Lebarbier, L. (2005). Detecting multiple change-points in the mean of Gaussian
process by model selection. Sign. Proc. 85 717-736.

Page, E. S. (1955). A test for a change in a parameter occurring at an unknown
point. Biometrika 42 523-526. MR0072412

Perreault, L., Bernier, J., Bobee, B. and Parent, E. (2000a). Bayesian
change-point analysis in hydrometeorological time series. Part 1. Normal model
revisited. J. Hydrol. 235 221-241.

Perreault, L., Bernier, J., Bobee, B. and Parent, E. (2000b). Bayesian change- 
point analysis in hydrometeorological time series. Part 2. Comparison of change-point 
models and forecasting. J. Hydrol. 235 242-263. 

Ruggieri, E., Herbert, T., Lawrence, K. T. and Lawrence, C. E. (2009).
Change point method for detecting regime shifts in paleoclimatic time series:
Application to $\delta^{18}$O time series of the Plio-Pleistocene.
Paleoceanography 24 PA1204, DOI:10.1029/2007PA001568.

Seidou, O. and Ouarda, T. B. M. J. (2007). Recursion-based multiple changepoint
detection in multiple linear regression and application to river streamflows.
Water Resour. Res. 43, DOI:10.1029/2006WR005021.

Shiryaev, A. N., Kabanov, Y. M., Kramkov, D. O. and Melnikov, A. V. (1994). 
Towards the theory of pricing of options of both European and American types, II, 
continuous time. Theory Probab. Appl. 39 61-102. 

Wu, Y. (2005). Inference for Change-Point and Post-Change Means After a CUSUM 
Test. Lecture Notes in Math. 180. Springer, New York. MR2142337 

Wu, Q.-Z., Cheng, H.-Y. and Jeng, B.-S. (2005). Motion detection via change-point 
detection for cumulative histograms of ratio images. Pattern. Recog. Lett. 26 555-563. 

Zou, C., Qiu, P. and Hawkins, D. (2009). Nonparametric control chart for monitoring
profiles using change point formulation and adaptive smoothing. Statist. Sinica 19 1337- 
1357. MR2536159 



S. B. FOTOPOULOS 

E. Khapalova 

Department of Management and Operations 
Washington State University 
Pullman, Washington 99164-4736 
USA 

E-MAIL: fotopo@wsu.edu
elena_k@wsu.edu



V. K. Jandhyala 
Department of Statistics 
Washington State University 
Pullman, Washington 99164-3113 
USA 

E-MAIL: jandhyala@wsu.edu 



